Abstract: An urban area is a location that offers a diversified living environment and diverse lifestyles,
where people live, work and enjoy themselves in the social and cultural relationships that urban proximity
provides. Urban roads are essential to communication within these areas. India is a populous country with
the second largest road network in the world. Owing to its large population, congestion is growing rapidly
as thousands of heterogeneous vehicles are added to India's urban roads, yet, because India is a developing
country, the level of service (LOS) is not adequately defined. In HCM 2000, the LOS criteria of roads are
defined for homogeneous traffic flow, whereas India has heterogeneous traffic flow with different operational
features. Defining LOS is essentially a classification problem, and cluster analysis is a suitable technique
for solving it. AdaBoost is a large margin learning algorithm used here to solve the classification problem.
Five cluster validation parameters are used to determine the optimal number of clusters. After the optimal
number of clusters was obtained, the AdaBoost algorithm was first applied to the free flow speed (FFS) data to
get the FFS ranges of the different urban street classes. It was then applied to average travel speed data to
determine the speed ranges of the different LOS categories, which were found to be lower than those suggested
in HCM 2000 on account of physical and surrounding environmental characteristics.
KEY WORDS: Urban roads, level of service (LOS), clustering analysis, AdaBoost algorithm, free flow speed
(FFS), average travel speed.



I. INTRODUCTION 

The simplest definition of urbanism or an urban area might be: confederation or union of neighboring 
clans resorting to a centre used as a common meeting place for worship, protection, and the like; hence, the 
political or sovereign body formed by such a community. An urban area can also be defined as a composite of
cells, neighbourhoods, or communities where people work together for the common good. The types of urban
areas can vary as greatly as the varieties of activities performed there: the means of production and the kinds of 
goods, trades, transportation, the delivery of goods, and services, or a combination of all these activities. A third 
definition says that urban areas are those locations where there is opportunity for a diversified living 
environment and diverse lifestyle. People live, work and enjoy themselves in social and cultural relationships 
provided by these proximities of an urban area. Urban areas can be simple or complex. They can have a rural
flavour or that of an industrial workshop. They can be peaceful or filled with all types of conflict. They can be 
small and easy to maintain, or gargantuan and filled with strife and economic problems. 

Urban streets are distinct from rural and suburban streets. Urban street design involves traffic flow,
speed and volume, and it affects the planning, design and operational aspects of transportation projects
as well as the allocation of limited financial resources among competing projects. There is therefore a
clear need to determine level of service criteria in the Indian context. Urban street level of service is
mainly calculated from field data as described in the HCM. The floating car method is the most common
technique for acquiring speed data. In this method, as a driver drives the vehicle, a passenger records the
elapsed time at predefined checkpoints. The elapsed time can be recorded with pen and paper, an audio
recorder or a small data recording device. The method is advantageous because it requires only a low-skilled
technician and very low-cost equipment. Its drawback is that it is labour intensive. Human errors that often
result from this labour intensiveness include both recording errors in the field and transcription errors as
the data are put into an electronic format (Turner et al., 1998). With improvements in portable computers,
the Distance Measuring Instrument (DMI) emerged as an alternative to the manual floating car method. A DMI
measures speed and distance using pulses from a sensor attached to the test vehicle's transmission (Quiroga
and Bullock, 1998). This method also has limitations: very complicated wiring is required to install a DMI
unit in a vehicle, frequent calibration and verification against factors unrelated to the unit are necessary,
and the stored data files are large, which leads to data storage problems (Turner et al., 1998; Benz and
Ogden, 1996). The development of information technology and the advancement of the Global Positioning System
(GPS) have largely overcome the data quality and quantity shortcomings of the manual and DMI methods of
collecting travel time data, and GPS has become one of the alternatives to the moving car observer method for
field data collection. In this method, a GPS receiver mounted on a vehicle automatically records location
along the urban corridor and speed at a regular sampling interval. A single technician can perform this task
easily and accurately. The field data collected through the GPS receiver must be referenced to a geographical
position, and an additional tool is required for this purpose. A Geographical Information System (GIS) solves
this problem, with the advantage of assigning the parameters received through GPS to an existing geographical
database. This automated procedure provides greater convenience, consistency, precision and accuracy than the
conventional procedure, and it helps to collect a large amount of travel time and speed data.
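
As a minimal sketch of how an average travel speed could be derived from GPS records of this kind, the
following assumes each record is a hypothetical (time in seconds, latitude, longitude) tuple and that the
great-circle distance between consecutive fixes approximates the distance travelled. The function names and
the record layout are illustrative assumptions, not the study's actual processing.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def average_travel_speed_kmh(fixes):
    """Average speed over a run; fixes is a time-ordered list of (time_s, lat, lon)."""
    distance_km = sum(
        haversine_km(a[1], a[2], b[1], b[2]) for a, b in zip(fixes, fixes[1:])
    )
    elapsed_h = (fixes[-1][0] - fixes[0][0]) / 3600.0
    return distance_km / elapsed_h


# Hypothetical run: three fixes taken 60 s apart along the equator.
run = [(0, 0.0, 0.0), (60, 0.0, 0.005), (120, 0.0, 0.010)]
speed = average_travel_speed_kmh(run)
```

The run needs at least two fixes with different timestamps; otherwise the elapsed time is zero.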

The overall framework of this study is illustrated in Figure 1.

Figure 1 Overall Framework of the Study

II. LITERATURE REVIEW

The level of service (LOS) concept was first introduced in the 1965 version of the Highway Capacity Manual
(HCM). This concept recognized that the driver's view of the transportation system is also important to
consider. It has since become more important to estimate not only the LOS but also key operational
performance measures such as queue length or average speed. It has also become important to expand the
analysis area from a single point to a segment, then from a linear segment to a two-dimensional area, and
ultimately to a single, integrated multimodal transportation system. The 1950 HCM was written exclusively to
address the needs of traffic engineers who were participating in planning, building and operating specific
roadway components. According to the 1965 HCM, level of service is described by six classes from "A" to "F",
defined on the basis of the combination of travel time and the ratio of traffic flow rate to the capacity of
road sections. The 1965 HCM concept was redefined in the 1985 version of the Highway Capacity Manual in terms
of several traffic conditions, including travel speed, traffic flow rate and traffic density, for each type
of road. Limitations of the 1985 HCM LOS measure were pointed out by Baumgaertner (1996), Cameron (1996) and
Brilon (2000). Baumgaertner (1996) pointed out that the continuous growth of urban populations, vehicle
ownership, average trip length and number of trips has resulted in a relative increase in traffic volumes.
Thus, travel conditions that would have been viewed as intolerable in the 1960s are considered normal by
today's motorists, especially commuters. Cameron (1996) stated that it was not uncommon to wait three minutes
at a congested urban intersection, with average delays often exceeding two minutes. Later, various studies
proposed extending the six level of service designations to nine or more. Defining the LOS criteria of urban
streets is basically a classification problem, and cluster analysis is a suitable technique for that
classification. A large amount of free flow speed (FFS) and average travel speed data is needed for cluster
analysis because the LOS of an urban street is a function of travel speed along street segments. Travel speed
data have been collected using the floating car method. Turner et al. (1998) found that the method is
susceptible to human error. Later, the distance measuring instrument (DMI) was introduced as an improvement
on the floating car method. Benz and Ogden (1996) found limitations in this method, namely difficulties with
the DMI unit and data storage problems. The HCM describes "level of service" as a qualitative measure that
describes traffic conditions in terms of speed, travel time, freedom to maneuver, comfort and convenience,
traffic interruptions and safety. Six classes are used to define LOS, designated by the letters "A" to "F",
where LOS A represents the best conditions and LOS F represents heavily congested flow with traffic demand
exceeding highway capacity.

Kittelson and Roess (2001) noted that the HCM 2000 methodologies are not based on user perception
surveys. The HCM 2000 methodologies resulted from a combination of consulting studies, research, debates and
discussions of the Highway Capacity and Quality of Service (HCQS) committee (Pecheux et al., 2000). In July
2001, at the midyear meeting of the HCQS committee, a motion was passed that stated "the committee recognizes
that there are significant issues with the current LOS structure and encourages investigations to address
these issues" (Pecheux et al., 2004). Flannery et al. (2005), while relating quantitative to qualitative
service measures for urban streets, found that LOS calculated by the HCM 2000 methodology predicted 35% of the
variance in mean driver rating. Brilon and Estel (2010) presented standardized methods that allow a
differentiated evaluation of saturated flow (LOS F) conditions beyond a static consideration of traffic
conditions in the German Highway Capacity Manual. According to the Indian Roads Congress (IRC, 1990), the LOS
of urban roads is strongly affected by factors such as heterogeneity of traffic, speed regulations, frequency
of intersections, presence of bus stops, on-street parking, roadside commercial activities and pedestrian
volumes.

The level of service (LOS) concept differs from country to country. In North America the LOS is
divided into six categories, where

A = free flow
B = reasonably free flow
C = stable flow
D = approaching unstable flow
E = unstable flow
F = forced or breakdown flow

The LOS concept is also used in the UK and Australia. In Australia, LOS is an integral component of
asset management plans. The HCM 2010 version incorporates tools for multimodal analysis of urban streets. The
primary basis for the new multimodal LOS procedures on urban streets is NCHRP Report 616: Multimodal Level of
Service Analysis for Urban Streets. That research developed a method for evaluating the multimodal level of
service (MMLOS) provided by different urban street designs and operations. Researchers can use the MMLOS to
evaluate various street designs in terms of their effects on the perceptions of auto drivers, transit
passengers, bicyclists and pedestrians.

AdaBoost was introduced in 1995 by Freund and Schapire. It is a well-known large margin learning algorithm
that can select a small set of the most discriminative features and combine them into an ensemble classifier,
which is also an additive model. Viola and Jones's work (robust real-time object detection) made AdaBoost
learning world famous in the computer vision and pattern recognition community. It is a meta-algorithm and can
be used in conjunction with many other learning algorithms to improve their performance. AdaBoost is adaptive
in the sense that subsequent classifiers are tweaked in favour of those instances misclassified by previous
classifiers.

III. ADABOOST CLUSTERING METHODOLOGY 

AdaBoost was introduced in 1995 by Freund and Schapire. It is a well-known large margin learning
algorithm that can select a small set of the most discriminative features and combine them into an ensemble
classifier. Viola and Jones's work made AdaBoost learning world famous in the computer vision and pattern
recognition community.

The mechanics of the AdaBoost learning algorithm involve three fundamental components: the weak learner,
the component classifier and the re-weighting function.

> Weak learner: The weak learner is essentially the criterion for choosing the best feature $f_t$ on the
weighted training set.

> The component classifier: The component classifier outputs the confidence of a sample being positive
based on its $f_t$ value.

> Sample re-weighting: Sample re-weighting enables the subsequent component classifier to concentrate on
the hard samples by assigning higher weights to the samples that were wrongly classified by previous
component classifiers.

AdaBoost is one of the most influential ensemble methods. It originated from the answer to an interesting
question posed by Kearns and Valiant in 1988: whether two complexity classes, weakly learnable and strongly
learnable problems, are equal. AdaBoost and its variants have been applied to diverse domains with great
success, owing to their solid theoretical foundation, accurate prediction and great simplicity (Schapire said
it needs only "just 10 lines of code"). For example, Viola and Jones combined AdaBoost with a cascade process
for face detection. They regarded rectangular features as weak learners, and by using AdaBoost to weight the
weak learners, they obtained very intuitive features for face detection. To achieve high accuracy as well as
high efficiency, they used a cascade process. As a result, they reported a very strong face detector: on a
466 MHz machine, face detection on a 384×288 image cost only 0.067 seconds, which was 15 times faster than
state-of-the-art face detectors at that time with comparable accuracy.

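The AdaBoost algorithm as described by Viola and Jones in 2001 proceeds as follows.

> Step 1: Given a set of training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, where $y_i = 0$ for a
negative sample and $y_i = 1$ for a positive sample. $N$ is the total number of training samples.

> Step 2: Initialize the weights $w_{1,i} = D(i)$. For a negative sample $D(i) = 1/(2m)$, where $m$ is the
number of negative samples. For a positive sample $D(i) = 1/(2l)$, where $l$ is the number of positive
samples, and $m + l = N$.

> Step 3: For $t = 1$ to $T$:

A: Normalise the weights:

$$q_{t,i} = \frac{w_{t,i}}{\sum_{j=1}^{N} w_{t,j}}$$

B: For each feature $f$, train a classifier $h(x, f, p, \theta)$. The error is evaluated with respect to $q_t$:

$$\epsilon_f = \sum_{i} q_{t,i} \, |h(x_i, f, p, \theta) - y_i|$$

C: Choose the classifier $h_t$ with the lowest error $\epsilon_t$:

$$\epsilon_t = \min_{f,p,\theta} \sum_{i} q_{t,i} \, |h(x_i, f, p, \theta) - y_i| = \sum_{i} q_{t,i} \, |h(x_i, f_t, p_t, \theta_t) - y_i|$$

$$h_t(x) = h(x, f_t, p_t, \theta_t)$$

D: Update the weights:

$$w_{t+1,i} = w_{t,i} \, \beta_t^{1 - e_i}, \qquad \beta_t = \frac{\epsilon_t}{1 - \epsilon_t}$$

where $e_i = 0$ if sample $x_i$ is classified correctly and $e_i = 1$ otherwise.

The final strong classifier is

$$h(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0 & \text{otherwise} \end{cases}$$

where $\alpha_t = \log \frac{1}{\beta_t}$.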
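
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated
assumptions, not the implementation used in the study: the single feature is a one-dimensional speed value,
the weak classifier is a threshold stump with polarity `p` and threshold `theta`, and the sample speeds and
0/1 labels are hypothetical. How the study derives labels from unlabelled speed data is not specified in the
text, so the labels here are supplied directly.

```python
import math


def stump(x, theta, p):
    """Weak classifier h(x, p, theta): returns 1 if p*x < p*theta, else 0."""
    return 1 if p * x < p * theta else 0


def train_adaboost(xs, ys, rounds):
    """Train on 1-D samples xs with labels ys in {0, 1}; returns [(alpha, theta, p), ...]."""
    m, l = ys.count(0), ys.count(1)
    # Step 2: initial weights 1/(2m) for negatives, 1/(2l) for positives.
    w = [1.0 / (2 * m) if y == 0 else 1.0 / (2 * l) for y in ys]
    uniq = sorted(set(xs))
    # Candidate thresholds: midpoints between sorted values plus one on each side.
    thresholds = [uniq[0] - 1.0] + [(a + b) / 2 for a, b in zip(uniq, uniq[1:])] + [uniq[-1] + 1.0]
    model = []
    for _ in range(rounds):
        total = sum(w)
        q = [wi / total for wi in w]  # Step 3A: normalise
        best = None
        for theta in thresholds:      # Steps 3B and 3C: lowest weighted error
            for p in (1, -1):
                err = sum(qi * abs(stump(x, theta, p) - y) for qi, x, y in zip(q, xs, ys))
                if best is None or err < best[0]:
                    best = (err, theta, p)
        err, theta, p = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0) and division by zero
        beta = err / (1 - err)
        model.append((math.log(1 / beta), theta, p))
        # Step 3D: shrink the weights of correctly classified samples by beta.
        w = [qi * (beta if stump(x, theta, p) == y else 1.0) for qi, x, y in zip(q, xs, ys)]
    return model


def predict(model, x):
    """Final strong classifier: weighted vote against half the total alpha."""
    score = sum(alpha * stump(x, theta, p) for alpha, theta, p in model)
    return 1 if score >= 0.5 * sum(alpha for alpha, _, _ in model) else 0


# Hypothetical speeds (km/h): label 1 for "fast" segments, 0 for "slow".
speeds = [20, 25, 30, 60, 70, 80]
labels = [0, 0, 0, 1, 1, 1]
model = train_adaboost(speeds, labels, rounds=3)
```

The clamp on the error is a practical safeguard added for this sketch; it is needed whenever a single stump
separates the training data perfectly, as it does in the toy data above.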



IV. VALIDITY INDEX 

Different validity indices have been suggested in the literature, but none of them is flawless on its own,
and consequently several indices were employed in this study: the Homogeneity-Separation Index (HSI), Rand
Index (RI), Adjusted Rand Index (ARI), Hubert Index (HI) and Mirkin Index (MI).
A) Homogeneity-Separation Index (HSI) 

Homogeneity is calculated as the average distance between each gene expression profile and the centre of
the cluster it belongs to. That is,

$$H_{ave} = \frac{1}{N_{gene}} \sum_{i} D\big(g_i, C(g_i)\big)$$

where $g_i$ is the $i$th gene and $C(g_i)$ is the centre of the cluster that $g_i$ belongs to; $N_{gene}$ is
the total number of genes; $D$ is the distance function.

Separation is calculated as the weighted average distance between cluster centres:

$$S_{ave} = \frac{1}{\sum_{i \ne j} N_{ci} N_{cj}} \sum_{i \ne j} N_{ci} N_{cj} \, D(C_i, C_j)$$

where $C_i$ and $C_j$ are the centres of the $i$th and $j$th clusters, and $N_{ci}$ and $N_{cj}$ are the
numbers of genes in the $i$th and $j$th clusters. Thus $H_{ave}$ reflects the compactness of the clusters while
$S_{ave}$ reflects the overall distance between clusters. Decreasing $H_{ave}$ or increasing $S_{ave}$
suggests an improvement in the clustering results.
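
A minimal sketch of the two quantities follows, assuming that each "gene" is a one-dimensional speed
observation, that the distance function $D$ is the absolute difference, and that cluster centres are cluster
means. These choices, the function name and the sample values are illustrative assumptions and not taken from
the study.

```python
def homogeneity_separation(points, labels):
    """Return (H_ave, S_ave) for 1-D points with integer cluster labels."""
    clusters = {}
    for x, c in zip(points, labels):
        clusters.setdefault(c, []).append(x)
    centres = {c: sum(v) / len(v) for c, v in clusters.items()}

    # Homogeneity: mean distance from each point to its own cluster centre.
    h_ave = sum(abs(x - centres[c]) for x, c in zip(points, labels)) / len(points)

    # Separation: centre-to-centre distance weighted by the product of cluster sizes.
    # Summing over i < j gives the same ratio as summing over i != j.
    keys = sorted(clusters)
    num = den = 0.0
    for a in range(len(keys)):
        for b in range(a + 1, len(keys)):
            weight = len(clusters[keys[a]]) * len(clusters[keys[b]])
            num += weight * abs(centres[keys[a]] - centres[keys[b]])
            den += weight
    return h_ave, num / den


# Hypothetical free flow speeds (km/h) split into two clusters.
h_ave, s_ave = homogeneity_separation([30, 32, 60, 64], [0, 0, 1, 1])
```

At least two clusters are required, since separation is undefined for a single cluster.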



B) Rand Index (RI) 

The Rand Index is the fraction of element pairs on which two clusterings agree, that is, pairs that are
either clustered together in both clusterings or clustered apart in both clusterings:

$$RI = \frac{N_{11} + N_{00}}{N_{11} + N_{00} + N_{01} + N_{10}}$$

where
$N_{11}$ = number of pairs of points clustered together in both clusterings,
$N_{00}$ = number of pairs that neither clustering places together,
$N_{01}$ = number of pairs clustered together in the second but not the first clustering,
$N_{10}$ = number of pairs clustered together in the first but not the second clustering.
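
The four pair counts and the resulting index can be computed directly by enumerating all pairs, as in the
following sketch. The label lists are hypothetical, and the function names are illustrative.

```python
from itertools import combinations


def pair_counts(first, second):
    """Return (N11, N00, N01, N10) for two clusterings given as label lists."""
    n11 = n00 = n01 = n10 = 0
    for i, j in combinations(range(len(first)), 2):
        same_first = first[i] == first[j]
        same_second = second[i] == second[j]
        if same_first and same_second:
            n11 += 1
        elif not same_first and not same_second:
            n00 += 1
        elif same_second:
            n01 += 1  # together in the second clustering only
        else:
            n10 += 1  # together in the first clustering only
    return n11, n00, n01, n10


def rand_index(first, second):
    n11, n00, n01, n10 = pair_counts(first, second)
    return (n11 + n00) / (n11 + n00 + n01 + n10)


# Hypothetical clusterings of four segments; the second splits one cluster.
counts = pair_counts([0, 0, 1, 1], [0, 0, 1, 2])
ri = rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

Enumerating pairs costs on the order of the square of the number of points, which is acceptable for a sketch
but slow for very large data sets.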

C) Adjusted Rand Index (ARI) 

The Rand Index has been adjusted such that the normalized index has expected value 0 and cannot exceed 1:

$$ARI = \frac{RI - E[RI]}{1 - E[RI]}$$
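
The text does not state how the expected value is obtained. A common choice, assumed here, is the
Hubert and Arabie form, in which the expectation is taken under random labelling with fixed cluster sizes and
the index is written with pair counts from the contingency table. The sketch below follows that assumption;
the label lists are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(first, second):
    """ARI of two label lists, using the contingency-table (Hubert-Arabie) form."""
    n = len(first)
    # Pairs placed together in both clusterings, in the first, and in the second.
    together = sum(comb(c, 2) for c in Counter(zip(first, second)).values())
    pairs_first = sum(comb(c, 2) for c in Counter(first).values())
    pairs_second = sum(comb(c, 2) for c in Counter(second).values())
    expected = pairs_first * pairs_second / comb(n, 2)
    maximum = 0.5 * (pairs_first + pairs_second)
    return (together - expected) / (maximum - expected)


ari = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

The denominator is zero when both clusterings put every point in one cluster or every point in its own
cluster, so those degenerate cases need separate handling.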

D) Hubert Index (HI) 

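Hubert's $\Gamma$ is defined as

$$\Gamma = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} \sum_{k=i+1}^{N} d_{ik} \, c_{ik}$$

where $N$ is the number of elements, $d_{ik}$ = distance between elements $i$ and $k$, and $c_{ik}$ = distance
between the clusters to which elements $i$ and $k$ belong (represented by their centroids).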
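
A minimal sketch of the statistic for one-dimensional data follows, assuming absolute difference as the
distance and cluster means as centroids. Both choices and the sample values are illustrative assumptions.

```python
def hubert_gamma(points, labels):
    """Raw Hubert Gamma for 1-D points with cluster labels."""
    members = {}
    for x, c in zip(points, labels):
        members.setdefault(c, []).append(x)
    centroids = {c: sum(v) / len(v) for c, v in members.items()}

    n = len(points)
    total = 0.0
    for i in range(n - 1):
        for k in range(i + 1, n):
            d_ik = abs(points[i] - points[k])                        # point-to-point distance
            c_ik = abs(centroids[labels[i]] - centroids[labels[k]])  # centroid-to-centroid distance
            total += d_ik * c_ik
    return 2.0 * total / (n * (n - 1))


# Hypothetical free flow speeds (km/h) in two clusters.
gamma = hubert_gamma([30, 32, 60, 64], [0, 0, 1, 1])
```

Pairs within the same cluster contribute nothing because their centroid distance is zero, so the statistic
grows when distant points are assigned to distant clusters.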
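Entropy:

Assuming that every point is equally likely to be picked, the entropy of a clustering $C$ is defined as

$$H(C) = -\sum_{i=1}^{k} p(i) \log p(i)$$

where $p(i) = n_i / n$ is the probability that a point belongs to cluster $i$, $n_i$ is the number of points
in cluster $i$, $n$ is the total number of points and $k$ is the number of clusters.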
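
As a short illustration, the entropy can be computed from the cluster sizes alone. The natural logarithm is
assumed here because the text does not fix the base, and the labels are hypothetical.

```python
import math
from collections import Counter


def clustering_entropy(labels):
    """Entropy of a clustering given as a list of cluster labels (natural log)."""
    n = len(labels)
    return -sum((size / n) * math.log(size / n) for size in Counter(labels).values())


# Two equally sized clusters give log(2); a single cluster gives zero.
entropy_two = clustering_entropy([0, 0, 1, 1])
entropy_one = clustering_entropy([0, 0, 0, 0])
```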
E) Mirkin Index (MI) 

The Mirkin index, which is also known as the Equivalence Mismatch Distance, is defined by

$$M(C, C') = \sum_{i=1}^{k} |C_i|^2 + \sum_{j=1}^{l} |C'_j|^2 - 2 \sum_{i=1}^{k} \sum_{j=1}^{l} m_{ij}^2$$

where $C = \{C_1, \ldots, C_k\}$ and $C' = \{C'_1, \ldots, C'_l\}$ are the two clusterings and
$m_{ij} = |C_i \cap C'_j|$ is the number of elements shared by cluster $C_i$ and cluster $C'_j$.

It corresponds to the Hamming distance for binary vectors if the set of all pairs of elements is enumerated
and a clustering is represented by a binary vector defined on this enumeration. An advantage is that this
distance is a metric on $\mathcal{P}(X)$, the set of all clusterings of the data set $X$. However, this
measure is very sensitive to cluster sizes, such that two clusterings that are "at right angles" to each
other (i.e. each cluster in one clustering contains the same number of elements of each of the clusters of
the other clustering) are closer to each other than two clusterings for which one is a refinement of the
other.
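
The index can be sketched from the cluster sizes and the sizes of the pairwise intersections, as below. The
label lists are hypothetical and the function name is illustrative.

```python
from collections import Counter


def mirkin_index(first, second):
    """Mirkin index of two clusterings given as label lists of equal length."""
    sum_first = sum(size * size for size in Counter(first).values())
    sum_second = sum(size * size for size in Counter(second).values())
    # Counter over label pairs gives the intersection sizes m_ij.
    sum_shared = sum(size * size for size in Counter(zip(first, second)).values())
    return sum_first + sum_second - 2 * sum_shared


mirkin = mirkin_index([0, 0, 1, 1], [0, 0, 1, 2])
```

For these labels exactly one pair is grouped differently by the two clusterings, and the index equals twice
that number of disagreeing pairs.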



V. RESULTS 

Adaboost Algorithm 

The free flow speed data acquired through the GPS receiver were clustered using the AdaBoost algorithm.
The free flow speed data were also used to determine the values of the validation measures. Five validation
parameters were used in this research. Their values were obtained for 2 to 10 clusters and are plotted in
Figure 2 (A) to Figure 2 (E).

These five validation parameters were used to find the optimum number of clusters for this particular
free flow speed data set. Knowing the optimum number of clusters, we can classify the urban street segments
into that number of urban street classes. A smaller number of clusters is generally considered better if the
variation in the validation parameters is minimal. According to the literature, the highest value of the
Homogeneity-Separation Index (HSI) signifies the optimal number of clusters for a particular data set. Figure
2 (A) shows that the index is highest for 3 clusters. The literature also indicates that the highest values
of the Rand Index (RI) and Adjusted Rand Index (ARI) give the optimal number of clusters for a given data set,
which is 4, as shown in Figure 2 (B) and (C). For the Mirkin Index (MI), the optimal number of clusters is
the point from which the index versus number of clusters graph turns upward; Figure 2 (D) shows the Mirkin
Index. For the Hubert Index (HI), the highest value indicates the optimum number of clusters, as shown in
Figure 2 (E). Of the five validation parameters considered in this study, four give an optimal cluster number
of 4, which is also the number suggested by HCM 2000. For this reason, the urban street segments in this
research were classified into four classes using the AdaBoost algorithm.
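
The selection rule described above amounts to a majority vote across the indices. A small sketch of that
vote follows, using the per-index optima reported in the text (3 for HSI, 4 for the other four). The
tie-breaking rule toward fewer clusters is an assumption made here, in line with the stated preference for a
smaller number of clusters.

```python
from collections import Counter

# Optimal cluster count suggested by each validation index, as reported in the text.
optimal_k_by_index = {"HSI": 3, "RI": 4, "ARI": 4, "MI": 4, "HI": 4}


def consensus_cluster_count(votes):
    """Most frequently suggested cluster count; ties go to the smaller count."""
    counts = Counter(votes.values())
    top = max(counts.values())
    return min(k for k, c in counts.items() if c == top)


chosen_k = consensus_cluster_count(optimal_k_by_index)
```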
Figure 2 Validation measures for optimal number of clusters using AdaBoost Method



Figure 3 shows the speed ranges for the different urban street classes. Different symbols in the plot
are used for the different urban street classes. It is observed from the collected data set that the urban
street class into which a street segment falls also agrees with the geometric and surrounding environmental
conditions of the road segment. A very good correlation was found between free flow speed and the geometric
and environmental characteristics of the streets under consideration.

Figure 3 AdaBoost Clustering of FFS for Urban Street Classification

After classification of the urban streets into classes, direction-wise average travel speeds on street
segments during both peak and off-peak hours were clustered using the AdaBoost algorithm to find the speed
ranges of the level of service categories. In Figure 4, the speed values are shown by different symbols
depending on the LOS category to which they belong. The legends in Figure 4 (A-D) give the speed ranges for
the six LOS categories obtained using AdaBoost clustering. These speed ranges are also shown in Table 1.

Figure 4 Level of service of urban street classes (I-IV) using AdaBoost clustering on average travel speeds

Table 1 Urban Street Speed Ranges for Different LOS Proposed for Indian Conditions by the AdaBoost Method

Urban Street Class                I               II              III             IV
Range of Free Flow Speed (FFS)    65 to 95 km/h   49 to 65 km/h   33 to 49 km/h   25 to 33 km/h
Typical FFS                       72 km/h         58 km/h         39 km/h         27 km/h

LOS                               Average Travel Speed (km/h)
A                                 >68             >52             >45             >32
B                                 >55-68          >45-52          >37-45          >25-32
C                                 >40-55          >40-45          >33-37          >14-25
D                                 >31-40          >28-40          >25-33          >10-14
E                                 >22-31          >13-28          >11-25          >8-10
F                                 <22             <13             <11             <8
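
To show how the ranges in Table 1 could be applied, the sketch below maps an urban street class and an
average travel speed to an LOS letter. The lower bounds are copied from Table 1. The function name is
illustrative, and treating a speed exactly equal to the lowest bound as LOS F is an assumption, since the
table leaves that boundary value unassigned.

```python
# Exclusive lower bounds (km/h) for LOS A to E in each urban street class, from Table 1.
LOS_LOWER_BOUNDS = {
    "I":   [("A", 68), ("B", 55), ("C", 40), ("D", 31), ("E", 22)],
    "II":  [("A", 52), ("B", 45), ("C", 40), ("D", 28), ("E", 13)],
    "III": [("A", 45), ("B", 37), ("C", 33), ("D", 25), ("E", 11)],
    "IV":  [("A", 32), ("B", 25), ("C", 14), ("D", 10), ("E", 8)],
}


def level_of_service(street_class, speed_kmh):
    """Return the LOS letter for an average travel speed on a given street class."""
    for los, lower in LOS_LOWER_BOUNDS[street_class]:
        if speed_kmh > lower:
            return los
    return "F"  # at or below the lowest bound (boundary handling is an assumption)


examples = [
    level_of_service("I", 70),
    level_of_service("I", 50),
    level_of_service("II", 45),
    level_of_service("III", 20),
    level_of_service("IV", 5),
]
```

Because each range in the table is open at its lower end and closed at its upper end, a speed equal to a
bound falls into the lower category, as with 45 km/h on a class II street.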



The coherence of the clustering results, in which urban streets are classified into four classes and
speed values into six LOS categories, was verified against the road inventory data collected during the
survey. The road inventory data considered here are number of lanes, median type, access density, roadside
development, on-street parking and pedestrian activity. These geometric and surrounding environmental
characteristics provide sufficient information on the physical characteristics of each road segment. They
were checked to determine whether the LOS experienced on a particular road segment during a travel run
actually matches the field conditions. From this analysis it is inferred that on a segment with good traffic
flow and good geometric and surrounding environmental conditions, commuters can experience good LOS "A", "B"
or "C". Commuters experience poor LOS such as "D", "E" and "F" when the traffic flow condition is poor and the
surrounding environmental conditions are not good. Travel speed, the parameter used to determine LOS, is
therefore quite capable of defining LOS categories for a particular urban street segment.

VI. SUMMARY AND CONCLUSION

In this research, an attempt has been made to define the LOS criteria of urban roads in the Indian
context for heterogeneous traffic flow conditions. A probe vehicle fitted with a Geo-XT GPS receiver was run
several times to accumulate a large quantity of speed data. Five important road corridors of the city of
Mumbai in Maharashtra state, India, were selected for this research. Five cluster validation parameters were
employed to obtain the optimal number of clusters from the FFS data for the classification of urban streets
into classes. After the optimal number of clusters was obtained, the recently developed AdaBoost clustering
technique was employed twice in this research: first, to classify the urban streets into the optimal number
of classes using the FFS data, and second, to cluster the direction-wise average travel speeds on street
segments during both peak and off-peak hours to find the speed ranges of the different LOS categories. It was
observed that the urban street speed ranges valid in the Indian context are proportionately lower than the
corresponding values mentioned in the Highway Capacity Manual (HCM 2000). In HCM 2000, the FFS ranges are
(90-70) km/h, (70-55) km/h, (55-50) km/h and (55-40) km/h for classes I, II, III and IV respectively, whereas
applying AdaBoost clustering to the FFS data gives speed ranges of (65-95) km/h, (49-65) km/h, (33-49) km/h
and (25-33) km/h, which are significantly lower than those mentioned in HCM 2000. These lower values are due
to the highly heterogeneous traffic flow on urban roads with varying geometry and surrounding environmental
features. The study can be extended to develop new models appropriate for defining LOS criteria in the Indian
context.